Coverage-adjusted entropy estimation.

Authors

  • Vincent Q Vu
  • Bin Yu
  • Robert E Kass
Abstract

Data on 'neural coding' have frequently been analyzed using information-theoretic measures. These formulations involve the fundamental and generally difficult statistical problem of estimating entropy. We review briefly several methods that have been advanced to estimate entropy and highlight a method, the coverage-adjusted entropy estimator (CAE), due to Chao and Shen that appeared recently in the environmental statistics literature. This method begins with the elementary Horvitz-Thompson estimator, developed for sampling from a finite population, and adjusts for the potential new species that have not yet been observed in the sample; these become the new patterns or 'words' in a spike train that have not yet been observed. The adjustment is due to I. J. Good, and is called the Good-Turing coverage estimate. We provide a new empirical regularization derivation of the coverage-adjusted probability estimator, which shrinks the maximum likelihood estimate. We prove that the CAE is consistent and first-order optimal, with rate O_P(1/log n), in the class of distributions with finite entropy variance and that, within the class of distributions with finite qth moment of the log-likelihood, the Good-Turing coverage estimate and the total probability of unobserved words converge at rate O_P(1/(log n)^q). We then provide a simulation study of the estimator with standard distributions and examples from neuronal data, where observations are dependent. The results show that, with a minor modification, the CAE performs much better than the MLE and is better than the best upper bound estimator, due to Paninski, when the number of possible words m is unknown or infinite.
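
The estimator described in the abstract can be illustrated with a minimal sketch. It assumes the form in which the Chao and Shen estimator is commonly stated: the Good-Turing coverage estimate is C = 1 - f1/n, where f1 is the number of words seen exactly once; the maximum likelihood probabilities are shrunk to C * (count / n); and each entropy term is divided by the Horvitz-Thompson inclusion probability 1 - (1 - p)^n. The function names are hypothetical, and the "minor modification" mentioned in the abstract is not specified there, so it is not implemented here.

```python
import math
from collections import Counter


def mle_entropy(words):
    """Plug-in (maximum likelihood) entropy estimate, in nats."""
    counts = Counter(words)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("need at least one observation")
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def coverage_adjusted_entropy(words):
    """Coverage-adjusted entropy estimate, in nats (assumed Chao-Shen form)."""
    counts = Counter(words)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("need at least one observation")

    # Good-Turing coverage estimate: one minus the fraction of singletons.
    singletons = sum(1 for c in counts.values() if c == 1)
    coverage = 1.0 - singletons / n
    if coverage == 0.0:
        # Unmodified estimate degenerates when every word is a singleton.
        raise ValueError("all words are singletons; coverage estimate is 0")

    entropy = 0.0
    for c in counts.values():
        p = coverage * c / n                # shrunken probability estimate
        inclusion = 1.0 - (1.0 - p) ** n    # chance the word appears in n draws
        entropy -= p * math.log(p) / inclusion
    return entropy
```

For a sample such as `["a", "a", "b", "b"]` there are no singletons, so the coverage estimate is 1 and the only adjustment comes from the Horvitz-Thompson weights, which raise the estimate above the plug-in value.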


Similar resources

Estimation of the Entropy Rate of Ergodic Markov Chains

In this paper an approximation for entropy rate of an ergodic Markov chain via sample path simulation is calculated. Although there is an explicit form of the entropy rate here, the exact computational method is laborious to apply. It is demonstrated that the estimated entropy rate of Markov chain via sample path not only converges to the correct entropy rate but also does it exponential...


E-Bayesian Approach in A Shrinkage Estimation of Parameter of Inverse Rayleigh Distribution under General Entropy Loss Function

Whenever approximate and initial information about the unknown parameter of a distribution is available, the shrinkage estimation method can be used to estimate it. In this paper, first the $E$-Bayesian estimation of the parameter of inverse Rayleigh distribution under the general entropy loss function is obtained. Then, the shrinkage estimate of the inverse Rayleigh distribution parameter i...


The Optimal Confidence Intervals for Agricultural Products' Price Forecasts Based on Hierarchical Historical Errors

With the levels of confidence and system complexity, interval forecasts and entropy analysis can deliver more information than point forecasts. In this paper, we take receivers’ demands as our starting point, use the trade-off model between accuracy and informativeness as the criterion to construct the optimal confidence interval, derive the theoretical formula of the optimal confidence interva...


Modeling of the Maximum Entropy Problem as an Optimal Control Problem and its Application to Pdf Estimation of Electricity Price

In this paper, the continuous optimal control theory is used to model and solve the maximum entropy problem for a continuous random variable. The maximum entropy principle provides a method to obtain least-biased probability density function (Pdf) estimation. In this paper, to find a closed form solution for the maximum entropy problem with any number of moment constraints, the entropy is consi...



Journal title:
  • Statistics in Medicine

Volume 26, Issue 21

Pages -

Publication date: 2007